Developing and validating data records from operational ocean color satellite instruments requires
substantial volumes of high quality in situ data. In the absence of broad, institutionally supported
field programs, organizations such as the NASA Ocean Biology Processing Group seek opportunistic
datasets for use in their operational satellite calibration and validation activities. The publicly
available, global biogeochemical dataset collected as part of the two and a half year Tara Oceans
expedition provides one such opportunity. We showed how the inline measurements of hyperspectral
absorption and attenuation coefficients collected onboard the R/V Tara can be used to evaluate
near-surface estimates of chlorophyll-a, spectral particulate backscattering coefficients,
particulate organic carbon, and particle size classes derived from the NASA Moderate Resolution
Imaging Spectroradiometer onboard Aqua (MODISA). The predominant strength of such flow-through
measurements is their sampling rate: the 375 days of measurements resulted in 165 viable
MODISA-to-in situ match-ups, compared to 13 from discrete water sampling. While the need to apply
bio-optical models to estimate biogeochemical quantities of interest from spectroscopy remains a
weakness, we demonstrated how discrete samples can be used in combination with flow-through
measurements to create data records of sufficient quality to conduct first order evaluations of
satellite-derived data products. Given an emerging agency desire to rapidly evaluate new satellite
missions, our results have significant implications for how calibration and validation teams for
these missions will be constructed.

Published by Elsevier B.V. 


1. Introduction 

Satellite ocean color instruments provide consistent and high-density data on temporal and 
spatial scales that far exceed current field and aircraft sampling strategies, often with time-series of 
sufficient length to allow retrospective analysis of long-term trends. For example, the daily, synoptic 
images captured by the NASA Sea-viewing Wide Field-of-view Sensor (SeaWiFS; 1997-2010) and 
Moderate Resolution Imaging Spectroradiometer onboard Aqua (MODISA; 2002-present) provide 
viable data records for observing decadal changes in biogeochemistry of both global and regional 
ecosystems (McClain, 2009). Briefly, satellite ocean color instruments measure the spectral radiance 
emanating from the top of the atmosphere at discrete visible and infrared wavelengths. Atmospheric 
correction algorithms are applied to remove the contribution of the atmosphere from the total signal 
and produce estimates of remote sensing reflectances ($R_{rs}(\lambda)$; sr$^{-1}$), the light exiting the water mass
normalized to a hypothetical condition of an overhead Sun and no atmosphere (Gordon and Wang,
1994). Bio-optical algorithms are applied to the $R_{rs}(\lambda)$ to produce estimates of additional geophysical
properties, such as the near-surface concentration of the phytoplankton pigment chlorophyll-a
($C_a$; mg m$^{-3}$) and spectral marine inherent optical properties (IOPs), namely the absorption and
scattering properties of seawater and its particulate and dissolved constituents (O’Reilly et al., 1998; 
Werdell et al., 2013). Time-series of these geophysical properties provide unparalleled resources for 
studying carbon stocks, phytoplankton population diversity and succession, and ecosystem responses 
to climatic disturbances on regional to global scales (e.g., Siegel et al., 2013). 

Refining bio-optical algorithms and verifying ocean color satellite data products requires a sub- 
stantial volume of in situ data to ensure their validity on global spatial and temporal scales (Werdell 
and Bailey, 2005; Bailey and Werdell, 2006). Previously, large volumes of high quality data were most 
successfully acquired via institutionally supported programs, such as the NASA Sensor Intercomparison
and Merger for Biological and Interdisciplinary Oceanic Studies (SIMBIOS) activity (Fargion and
McClain, 2003). During its six-year tenure, SIMBIOS enabled the assembly of 67,000 measurements 
from 1100 unique field campaigns collected by an assortment of 62 international researchers for inclu- 
sion in the NASA SeaWiFS Bio-optical Archive and Storage System (SeaBASS), the permanent archive 
for in situ data obtained under the auspices of the NASA Ocean Biology and Biogeochemistry Pro- 
gram (Werdell et al., 2003). While extremely useful, these data remain heterogeneously distributed in 
time and space, with biases toward Spring through Fall and toward the coastal and North Atlantic oceans.
In the absence of a coordinated activity, organizations responsible for operational ocean color satellite 
missions, such as the NASA Ocean Biology Processing Group (OBPG; oceancolor.gsfc.nasa.gov) oppor- 
tunistically seek in situ data records to support their algorithm development and satellite data product 
validation activities. 

The Tara Oceans expedition (September 2009 to March 2012) provides one novel opportunity. 
Briefly, Tara Expeditions (oceans.taraexpeditions.org ; Karsenti et al., 2011) conducted a ~91,000 km 
voyage on the R/V Tara over two and a half years to capture a view of the global distribution of 
Fig. 1. Global distribution of Tara Oceans stations considered in this analysis. Red circles show all available AC-S hourly
15-minute bins (N = 1708). Blue squares show MODISA-to-AC-S match-ups (N = 165). Green circles show all available HPLC
measurements (N = 130). Black squares show MODISA-to-HPLC match-ups (N = 13). (For interpretation of the references to 
colour in this figure legend, the reader is referred to the web version of this article.) 



marine planktonic organisms (Fig. 1). As part of this expedition, a hyperspectral absorption and 
attenuation meter (WETLabs, Inc. AC-S) was outfitted within the flow-through system of the R/V 
Tara (Boss et al., 2013; Slade et al., 2010). Continuous sampling by this inline system alternated 
between whole seawater and seawater that passed through a 0.2 µm filter, providing calibration-independent
measurements of the absorption and attenuation of marine particles along the full cruise
track (Slade et al., 2010). Ultimately, Tara Oceans collected 454 days of particulate optical properties 
along 70,000 km. Given their broad spatial and temporal distributions, these optical data records
provide a unique resource to support operational ocean color satellite validation and bio-optical
algorithm refinement activities.

Here, we evaluate the inline AC-S measurements collected as part of the Tara Oceans expedition for use as
“ground truth” for the validation of MODISA ocean color data products. To our knowledge, underway
measurements of particulate absorption and attenuation coefficients have yet to be used to
comprehensively evaluate the quality of satellite-derived $C_a$ and IOPs, let alone particulate organic carbon
and phytoplankton community structure. While this requires some modeling to estimate ocean color
data products of interest, we show that satellite-to-in situ match-ups from these proxy estimations
fall well within the envelope of standard match-ups that use direct quantification of biogeochemical
variables (e.g., high performance liquid chromatography (HPLC) determination of phytoplankton
pigment concentrations). In doing so, we demonstrate the value of flow-through absorption and
attenuation meter systems as resources for accumulating substantial volumes of reliable, (potentially)
spatially and temporally diverse, and reasonably low-cost data streams for ocean color satellite-to-in
situ match-up analyses. Furthermore, we explore how continuous, flow-through sampling can assist
in reconciling the comparison of different spatial scales (and sub-pixel variabilities) that confound
standard satellite-to-in situ match-up analyses (Bailey and Werdell, 2006).

2. Methods 

2.1. Tara Oceans in situ data

The R/V Tara hosted an inline system within its forecastle bilge that included a WET Labs, Inc. 
AC-S instrument and Sea-Bird Electronics SBE45 MicroTSG unit, as described in detail in Boss et al. 
(2013). With regard to the former, flowing seawater from 2 m below sea level entered the system
at a Vortex debubbler before a three-way electrically actuated valve that sent the flow either directly
to the AC-S instrument or through a 0.2 µm cartridge filter that preceded the AC-S instrument. We
processed all data following the methods of Slade et al. (2010), which included residual temperature
and salinity corrections. We calculated spectral particulate absorption ($a_p(\lambda) = a_{unfiltered}(\lambda) - a_{filtered}(\lambda)$;
m$^{-1}$) and attenuation ($c_p(\lambda) = c_{unfiltered}(\lambda) - c_{filtered}(\lambda)$; m$^{-1}$) by interpolating between
the filtered readings when unfiltered seawater was measured and performed a residual temperature
correction to account for possible slight differences in temperature between the filtered and unfiltered
samples (Slade et al., 2010). This provides calibration-independent estimates of $a_p(\lambda)$ and $c_p(\lambda)$,
as instrumental drifts and residual calibration errors persist in both the filtered and unfiltered
measurements and can therefore be removed by subtracting the former from the latter (as long as 
the instrumental drift has a significantly longer timescale than the timescale of switching between 
measurements). We averaged all data into one-minute bins to suppress the high frequency variability 
(instrument noise plus sample inhomogeneity) that can often mask any low frequency variability of 
interest. This ultimately resulted in over 310,000 final spectra of absorption and attenuation over 375 
days (attenuation was available for 375 of 454 days (Boss et al., 2013)). 
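The differencing and binning steps described above can be sketched as follows. This is a minimal illustration at a single wavelength, not the processing code of Slade et al. (2010): the function names, the clamping at the ends of the filtered record, and the sample numbers are assumptions of the sketch, and the residual temperature and salinity corrections are omitted.

```python
from bisect import bisect_left


def interp_linear(t, t_known, y_known):
    """Linearly interpolate y at time t from sorted t_known; clamp outside the record."""
    if t <= t_known[0]:
        return y_known[0]
    if t >= t_known[-1]:
        return y_known[-1]
    j = bisect_left(t_known, t)
    t0, t1 = t_known[j - 1], t_known[j]
    y0, y1 = y_known[j - 1], y_known[j]
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


def particulate(t_unfilt, x_unfilt, t_filt, x_filt):
    """Particulate signal: unfiltered reading minus the filtered reading interpolated in time."""
    return [x - interp_linear(t, t_filt, x_filt) for t, x in zip(t_unfilt, x_unfilt)]


def bin_mean(times, values, width=60.0):
    """Average values into fixed-width time bins (seconds); returns {bin_start: mean}."""
    sums = {}
    for t, v in zip(times, values):
        key = (t // width) * width
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + v, count + 1)
    return {k: total / count for k, (total, count) in sorted(sums.items())}


# Hypothetical numbers: the filtered baseline drifts from 0.010 to 0.012 m^-1 over 600 s.
a_p = particulate([150, 450], [0.0305, 0.0315], [0, 600], [0.010, 0.012])
```

Because the drifting baseline appears in both the filtered and unfiltered readings, the difference in this example stays constant even though the raw readings do not.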

We collected near-surface water samples for HPLC pigment analysis at 130 locations along the 
cruise track of the R/V Tara. For each sample, we vacuum-filtered 2 L of water through 25 mm (in 
some cases, 47 mm) diameter Whatman GF/F glass fiber filters with a nominal 0.7 µm pore size. We stored samples in
liquid nitrogen, then at −80 °C, until their analysis at the Laboratoire d’Oceanographie de Villefranche
(LOV; France). At LOV, the filters were extracted in 3 mL (6 mL for 47 mm filters) 100% methanol,
disrupted by sonication, and clarified two hours later by vacuum filtration. Within 24 h, the extracts
were analyzed by HPLC using a complete Agilent Technologies 1200 system according to the protocol
described in Ras et al. (2008). 

2.2. Modeling of ocean color data products 

Marine IOPs and $C_a$ are the principal geophysical variables derived from satellite measurements of
ocean color. Historically, $C_a$ provides the standard climate data record from ocean color satellite
time-series (Ras, 2011). More recently, NASA and the International Ocean Colour Coordinating Group
(IOCCG) invested significant effort in improving remotely sensed retrievals of marine IOPs (Werdell
et al., 2013; IOCCG, 2006), including those that provide indices of phytoplankton and marine particle
community structure (Brewin et al., 2011). With the goal of evaluating MODISA-derived $C_a$ and
IOP data records, we generated estimates of $C_a$, the spectral backscattering coefficients of particles
($b_{bp}(\lambda)$; m$^{-1}$), and the spectral slopes of $b_{bp}(\lambda)$ ($\eta$; unitless) and $c_p(\lambda)$ ($\gamma$; unitless) from the Tara
inline AC-S time-series.

Following Bricaud et al. (1998), $C_a$ can be related to spectral absorption of phytoplankton ($a_{ph}(\lambda)$;
m$^{-1}$) in the open ocean via a power-law:

$$a_{ph}(\lambda) = A(\lambda)\, C_a^{B(\lambda)}. \tag{1}$$

Phytoplankton absorption in the red can be estimated using the line height method of Davis et al. 
(1997) as modified by Boss et al. (2007): 

$$a_{ph}(676) = a_p(676) - \left[\tfrac{39}{65}\, a_p(650) + \tfrac{26}{65}\, a_p(715)\right]. \tag{2}$$
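The weights in Eq. (2) are those of a straight baseline drawn between 650 and 715 nm and evaluated at 676 nm: (715 − 676)/65 = 39/65 on the 650 nm value and (676 − 650)/65 = 26/65 on the 715 nm value. A minimal sketch of that calculation follows; the function name and the sample absorption values are illustrative only.

```python
def line_height(a_lo, a_peak, a_hi, wl_lo=650.0, wl_peak=676.0, wl_hi=715.0):
    """Absorption at the peak minus a linear baseline between the two flanking wavelengths."""
    w_hi = (wl_peak - wl_lo) / (wl_hi - wl_lo)  # 26/65 with the default wavelengths
    w_lo = 1.0 - w_hi                           # 39/65 with the default wavelengths
    return a_peak - (w_lo * a_lo + w_hi * a_hi)


# Hypothetical particulate absorption values (m^-1) at 650, 676, and 715 nm.
aph676 = line_height(0.010, 0.012, 0.000)
```

A spectrum that is exactly linear between 650 and 715 nm returns a line height of zero, which is the sense in which the method isolates the pigment peak from the sloping background.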

As in Bricaud et al. (1998) and Boss et al. (2013), we developed a statistical relationship between
log-transformed $a_{ph}(676)$ and $C_a$. To do so, we identified 52 HPLC samples collected within 1 h of the AC-S
stations prepared for satellite-to-in situ match-up analysis (see Section 2.3 below). Type II linear
regression yielded:

$$a_{ph}(676) = A(676)\, C_a^{B(676)} = 0.0152\, C_a^{0.9055}, \tag{3}$$

which corresponds well with a relationship reported for oceanic assemblages of phytoplankton 
by Bricaud et al. (1998) (A = 0.0180; B = 0.816) and for a similar dataset by Boss et al. (2013) 
(A = 0.0160; B = 0.865). The correlation coefficient and root mean square error for our derived 
relationship are 0.88 and 45%, respectively. We produced estimates of $C_a$ via inversion of Eq. (3).
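Inverting Eq. (3) gives $C_a = [a_{ph}(676)/0.0152]^{1/0.9055}$. A short sketch of the forward and inverse relationships, using the coefficients reported above; the function names and the handling of non-positive line heights are choices made for this illustration.

```python
A_676 = 0.0152  # coefficient from Eq. (3)
B_676 = 0.9055  # exponent from Eq. (3)


def aph676_from_chl(chl):
    """Forward power law of Eq. (3): a_ph(676) in m^-1 from C_a in mg m^-3."""
    return A_676 * chl ** B_676


def chl_from_aph676(aph676):
    """Inverse of Eq. (3); returns NaN for non-positive line heights."""
    if aph676 <= 0.0:
        return float("nan")
    return (aph676 / A_676) ** (1.0 / B_676)


chl = chl_from_aph676(aph676_from_chl(0.2))  # round trip recovers 0.2 mg m^-3
```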
Second, $b_{bp}(\lambda)$ relates to $b_p(\lambda)$ via:


$$b_{bp}(\lambda) = \tilde{b}_{bp}(\lambda)\, b_p(\lambda) \tag{4a}$$

$$b_{bp}(\lambda) = \tilde{b}_{bp}(\lambda)\, [c_p(\lambda) - a_p(\lambda)] \tag{4b}$$


where $\tilde{b}_{bp}(\lambda)$ is the dimensionless backscattering ratio that describes the proportion of light scattered
in the backward hemisphere by particles, and $b_p(\lambda)$ is the spectral scattering of particles, defined as
the difference between attenuation and absorption ($= c_p(\lambda) - a_p(\lambda)$; m$^{-1}$). Twardowski et al. (2001)
proposed a spectrally independent relationship between $\tilde{b}_{bp}$ and $C_a$:

$$\tilde{b}_{bp} = 0.0096\, C_a^{-0.253}, \tag{5}$$

which yields values of 0.02, 0.011, and 0.008 for $C_a$ of 0.05, 0.5, and 2 mg m$^{-3}$, respectively. Note
that Whitmire et al. (2007) confirmed the spectral independence of Eq. (5) using a diverse in situ
dataset and reported an average $\tilde{b}_{bp}$ of 0.01. Using $C_a$ from Eq. (3) as input into Eq. (5), $b_{bp}(\lambda)$ can
therefore be estimated as:

$$b_{bp}(\lambda) = \left(0.0096\, C_a^{-0.253}\right) [c_p(\lambda) - a_p(\lambda)]. \tag{6}$$
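Eqs. (5) and (6) can be checked numerically with a few lines of code. The sketch below reproduces the three backscattering ratios quoted above; the function names and the sample $c_p$ and $a_p$ values are illustrative assumptions.

```python
def backscatter_ratio(chl):
    """Eq. (5): dimensionless particulate backscattering ratio from C_a (mg m^-3)."""
    return 0.0096 * chl ** -0.253


def bbp_from_acs(cp, ap, chl):
    """Eq. (6): b_bp (m^-1) from AC-S attenuation and absorption at one wavelength."""
    return backscatter_ratio(chl) * (cp - ap)


ratios = [backscatter_ratio(c) for c in (0.05, 0.5, 2.0)]
# Hypothetical AC-S values at one wavelength: c_p = 0.12 m^-1, a_p = 0.02 m^-1.
bbp = bbp_from_acs(0.12, 0.02, 0.5)
```

The ratio varies only by a factor of about 2.5 across a 40-fold range in $C_a$, which is why the scattering term $c_p - a_p$ carries most of the signal in the modeled $b_{bp}$.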

Finally, we calculated the dimensionless power-law slope for spectral particulate backscattering
($\eta$) and attenuation ($\gamma$) as:

$$b_{bp}(\lambda) = b_{bp}(\lambda_0)\, [\lambda/\lambda_0]^{-\eta} \tag{7a}$$

$$c_p(\lambda) = c_p(\lambda_0)\, [\lambda/\lambda_0]^{-\gamma}, \tag{7b}$$

using the non-linear minimization approach of Levenberg-Marquardt and $\lambda_0 = 440$ nm. While the
power-law function has been found to fit $c_p(\lambda)$ well and to be linked to size distribution parameters
(Eq. (7b); Boss et al., 2001), we acknowledge that the validity of a similar relationship for $b_{bp}(\lambda)$ (Eq.
(7a)) remains uncertain and requires future research (Slade et al., 2011).
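To illustrate what the slopes in Eqs. (7a) and (7b) represent, the sketch below fits a power law by ordinary least squares on log-transformed data. This is a deliberate substitution, not the Levenberg-Marquardt non-linear minimization used in the analysis: the two approaches agree for noise-free spectra but weight noisy data differently. The function name and the synthetic spectrum are assumptions of the example.

```python
import math


def power_law_slope(wavelengths, values, ref_wl=440.0):
    """Fit values = amp * (wl / ref_wl) ** (-slope) in log-log space; return (amp, slope)."""
    xs = [math.log(w / ref_wl) for w in wavelengths]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # regression slope in log-log space
    amp = math.exp(mean_y - b * mean_x)  # fitted value at the reference wavelength
    return amp, -b


# Synthetic c_p spectrum with a known slope of 0.8 and c_p(440) = 0.1 m^-1.
wl = [412.0, 443.0, 488.0, 531.0, 547.0, 667.0]
cp = [0.1 * (w / 440.0) ** -0.8 for w in wl]
amp, gamma = power_law_slope(wl, cp)
```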

2.3. Satellite data product validation 

We generated Level-2 satellite-to-in situ match-ups for MODISA using the operational OBPG 
validation infrastructure (seabass.gsfc.nasa.gov/seabasscgi/search.cgi). We prepared the in situ AC-S 
data for comparison with the satellite measurements by generating 15 min averages of $C_a$, $b_{bp}(\lambda)$, $\gamma$,
and $\eta$ centered on 1:30 PM (in the time zone local to the R/V Tara), which coincided generally with the
daily local overpass of MODISA. Satellite data processing and quality assurance for these match-ups
followed Bailey and Werdell (2006). Specifically: (a) temporal coincidence was defined as ±3 h;
(b) satellite values were the filtered mean of all unmasked pixels in a 5 × 5 box centered on the in situ
target; and (c) satellite values were excluded when the median coefficient of variation for unflagged
pixels within the box exceeded 0.15. With regard to (a), the satellite-to-in situ time difference never
exceeded one hour, given our use of averages at local 1:30 PM. We processed MODISA data using its
R2013.0 (February 2013) reprocessing configuration. We considered the following MODISA-derived 
geophysical variables: 

o $C_a$ from O’Reilly et al. (1998) (OC3M);
o $b_{bp}(\lambda)$ from Werdell et al. (2013);
o particulate organic carbon (POC; mg m$^{-3}$) from Stramski et al. (2008); and,
o relative particle size class (PSC; %) from Uitz et al. (2006) and Hirata et al. (2011).
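The match-up criteria (a) to (c) listed earlier in this section can be sketched for a single data product as below. This is a simplified illustration rather than the operational OBPG code: the function names, the 1.5 standard deviation window used for the filtered mean, and the use of one product's coefficient of variation in place of a median across products are all assumptions of the sketch.

```python
import statistics


def filtered_mean(pixels, n_sigma=1.5):
    """Mean of the pixels lying within n_sigma standard deviations of the box mean."""
    mu = statistics.mean(pixels)
    sd = statistics.stdev(pixels) if len(pixels) > 1 else 0.0
    kept = [p for p in pixels if abs(p - mu) <= n_sigma * sd]
    return statistics.mean(kept), kept


def screen_matchup(box, dt_hours, max_dt=3.0, max_cv=0.15):
    """box: 25 values from a 5 x 5 window, None where masked. Returns a value or None."""
    if abs(dt_hours) > max_dt:            # criterion (a): temporal coincidence
        return None
    valid = [p for p in box if p is not None]
    if not valid:
        return None
    mean, kept = filtered_mean(valid)     # criterion (b): filtered mean of unmasked pixels
    if mean == 0:
        return None
    cv = statistics.stdev(kept) / mean if len(kept) > 1 else 0.0
    if cv > max_cv:                       # criterion (c): spatial homogeneity
        return None
    return mean


uniform_box = [1.0] * 24 + [None]
patchy_box = [0.1, 1.0] * 12 + [None]
```

With these made-up boxes, the uniform scene passes when the time difference is within three hours, while the patchy scene is rejected by the homogeneity test.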

Note that the operational OBPG version of OC3M includes the modifications presented in 
oceancolor.gsfc.nasa.gov/ANALYSIS/ocv6/. We also downloaded the OBPG operational MODISA-to-in 
situ match-up results for qualitative comparison with those from Tara Oceans. Finally, as MODISA 
maintains a ~1.1 km$^2$ footprint at nadir, we calculated the coefficient of variation (COV =
standard deviation/mean) for $C_a$ and a 10 nm bandwidth around $b_{bp}(650)$ from the AC-S at 1 km$^2$
bins along the cruise track to help assess the role of sub-pixel variability in the MODISA-to-in situ Tara
Oceans match-ups.
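A minimal sketch of the COV calculation follows. It groups samples by cumulative along-track distance as a stand-in for the 1 km$^2$ spatial bins, which is an assumption of this example, as are the function name, the use of the sample standard deviation, and the made-up values.

```python
import statistics


def cov_by_bin(distance_km, values, bin_km=1.0):
    """COV (standard deviation / mean) of values grouped into along-track distance bins."""
    bins = {}
    for d, v in zip(distance_km, values):
        bins.setdefault(int(d // bin_km), []).append(v)
    result = {}
    for key, vals in sorted(bins.items()):
        mean = statistics.mean(vals)
        if len(vals) > 1 and mean != 0:   # a COV needs at least two samples
            result[key] = statistics.stdev(vals) / mean
    return result


# Hypothetical one-minute samples: distance along track (km) and a measured quantity.
cov = cov_by_bin([0.1, 0.5, 0.9, 1.2, 1.8], [1.0, 1.0, 1.0, 2.0, 4.0])
```

Summary statistics such as the median and 99th percentile can then be taken over the returned per-bin values.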

3. Results 

Direct comparisons of satellite-derived and in situ measurements provide estimates of the accuracy 
and precision of the satellite data products (Bailey and Werdell, 2006). Overall, the MODISA and 
AC-S-derived $C_a$ compared favorably, particularly for $C_a$ < 4 mg m$^{-3}$ (Fig. 2A, Table 1). The slope of
Fig. 2. Comparisons of satellite-derived and in situ $C_a$. (A) MODISA versus AC-S $C_a$. (B) MODISA versus HPLC $C_a$. (C) HPLC versus
AC-S $C_a$, not limited to match-ups with MODISA. In (A) and (B), the black circles show the Tara Oceans satellite-to-in situ
match-ups, and the gray circles show all available MODISA $C_a$ match-ups available from the OBPG and SeaBASS. The solid line shows
a 1:1 relationship. Table 1 presents complementary regression statistics.


Table 1 

Statistics for MODISA versus AC-S $C_a$ and $b_{bp}(\lambda)$ match-ups determined using Type II linear regression. N, $r^2$, Slope (SE), Ratio,
and MPD indicate sample size, coefficient of determination, regression slope (standard error), median satellite-to-in situ ratio,
and absolute median percent difference. We calculated all statistics using log-transformed data, with the exception of Ratio
and MPD. We calculated MPD as the median of [200% × |MODISA$_i$ − AC-S$_i$| / (MODISA$_i$ + AC-S$_i$)] for a population of i match-ups.
The bottom row shows regression statistics for coincident (±30 min) in situ $C_a$ derived from the AC-S versus HPLC.


Product                                N      $r^2$   Slope (SE)    Ratio   MPD
$C_a$ - AC-S                           165    0.83    1.08 (0.04)   1.08    33.7
$C_a$ - AC-S (< 4 mg m$^{-3}$)         156    0.85    0.98 (0.03)   1.06    31.7
$C_a$ - HPLC                           13     0.87    1.03 (0.12)   1.13    25.0
$b_{bp}(412)$                          167    0.64    1.09 (0.03)   1.04    22.5
$b_{bp}(443)$                          167    0.66    1.12 (0.03)   0.98    19.6
$b_{bp}(488)$                          167    0.68    1.15 (0.03)   0.87    18.6
$b_{bp}(531)$                          167    0.70    1.16 (0.03)   0.77    25.8
$b_{bp}(547)$                          167    0.70    1.16 (0.03)   0.75    28.6
$b_{bp}(667)$                          167    0.72    1.24 (0.03)   0.64    43.3
$C_a$ - AC-S vs. HPLC                  52     0.78    1.10 (0.08)   1.01    24.5


the Type II linear regression between AC-S and MODISA $C_a$ slightly exceeded unity (1.08); however, the
$r^2$ (0.83) nearly matched that reported by the OBPG for operational MODISA match-up analyses that
include all available SeaBASS data (0.88) (Table 2). Visually, the AC-S match-ups fell within the cloud
of all available SeaBASS match-ups (Fig. 2A). When we excluded MODISA-derived $C_a$ > 4 mg m$^{-3}$
(removing 9 of the 165 stations), the slope fell to almost unity (0.98) and the $r^2$ rose to 0.85 (Table 1).
In both scenarios, MODISA demonstrated a slight positive bias, with median satellite-to-in situ ratios
of ~1.07 and median absolute percent differences (MPD) of ~32% (the caption for Table 1 presents
our calculations of MPD). Per Eq. (3), our AC-S estimates of $C_a$ directly follow our derived, statistical
relationship between HPLC estimates of $C_a$ and AC-S estimates of $a_{ph}(676)$.
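The match-up statistics reported in Table 1 can be sketched as follows. The Ratio and MPD follow the definitions in the Table 1 caption; for the Type II regression, this example assumes the reduced major axis form (slope equal to the ratio of standard deviations, signed by the covariance), which may differ from the variant used in the analysis. The function name and the synthetic data are illustrative.

```python
import math
import statistics


def matchup_stats(sat, insitu):
    """Median ratio, MPD (%), and log10-space r^2 and reduced major axis slope."""
    ratio = statistics.median(s / i for s, i in zip(sat, insitu))
    mpd = statistics.median(200.0 * abs(s - i) / (s + i) for s, i in zip(sat, insitu))
    x = [math.log10(i) for i in insitu]
    y = [math.log10(s) for s in sat]
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r2 = sxy ** 2 / (sxx * syy)
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return {"ratio": ratio, "mpd": mpd, "r2": r2, "slope": slope}


# Synthetic example: the satellite reads 10% high at every station.
insitu = [0.05, 0.1, 0.5, 1.0, 2.0]
sat = [1.1 * v for v in insitu]
stats = matchup_stats(sat, insitu)
```

A uniform 10% overestimate produces a ratio of 1.1 and an MPD near 9.5% while leaving the log-space slope at unity, which shows why the slope and the ratio are reported separately.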

In the satellite ocean color paradigm, the oceanographic community currently considers HPLC-derived
$C_a$ as the state-of-the-art for bio-optical algorithm development and data product validation
(Ras et al., 2008; Hooker et al., 2005). Despite the small sample size, the MODISA and HPLC-derived $C_a$
compared very well, with a slope near unity (1.03), a ratio of 1.13, and an MPD of 25% (Table 1). Visually,
these match-up stations fell well within the bounds of the global data set of all available SeaBASS
match-ups (Fig. 2B, Table 2). The AC-S and HPLC-derived $C_a$ complemented each other (Fig. 2C), which
is not surprising given that the latter supplied the power-law coefficients in Eq. (3) for AC-S
processing. Despite a ratio near unity (1.01), HPLC values from the clearest waters ($C_a$ < 0.02 mg m$^{-3}$)
imparted a slope above unity (1.10) and elevated MPD (24.5%) for this comparison. Eliminating the three
lowest values (all < 0.02 mg m$^{-3}$) reduced the slope and MPD to 1.05 and 21.8%, respectively.

Comparisons of MODISA and AC-S-derived $b_{bp}(\lambda)$ largely matched those reported by the OBPG
(Tables 1 and 2). The $r^2$ ranged from 0.64 to 0.72, which improved upon the operational OBPG results
(0.57 to 0.62). The MPD and ratios, however, degraded to ranges of 18.6 to 43.3 and 0.64 to 1.04,
Table 2

Statistics for MODISA versus in situ $C_a$ and $b_{bp}(\lambda)$ match-ups as reported by the OBPG for all available data in SeaBASS
(oceancolor.gsfc.nasa.gov/seabasscgi/search.cgi) as of June 2013. Methods and abbreviations as in Table 1.

Product          N      $r^2$   Slope (SE)    Ratio   MPD
$C_a$            760    0.88    1.02 (0.13)   1.12    30.9
$b_{bp}(412)$    63     0.61    1.05 (0.09)   0.94    17.3
$b_{bp}(443)$    63     0.62    1.08 (0.09)   0.94    17.3
$b_{bp}(488)$    63     0.62    1.11 (0.09)   0.95    17.2
$b_{bp}(531)$    64     0.62    1.12 (0.09)   0.97    17.9
$b_{bp}(547)$    63     0.61    1.14 (0.10)   0.97    18.5
$b_{bp}(667)$    63     0.57    1.17 (0.10)   1.00    19.9





Fig. 3. Comparisons of satellite-derived and in situ IOPs and POC. (A) MODISA versus AC-S $b_{bp}(443)$. The solid line shows a
1:1 relationship and Table 1 presents complementary regression statistics. (B) MODISA POC versus AC-S $b_{bp}(700)$ (N = 167).
The red and green solid lines show the relationships presented in Stramski et al. (2008) and Loisel et al. (2001). (C) MODISA
POC versus AC-S $c_p(665)$ (N = 167). The red and blue solid lines show the relationships presented in Stramski et al. (2008)
and Behrenfeld and Boss (2006). (For interpretation of the references to colour in this figure legend, the reader is referred to 
the web version of this article.) 


respectively (in contrast to 17.2 to 19.9 and 0.94 to 1.0). Similar to that reported in Werdell et al. 
(2013), the slopes and ratios showed spectral dependence (inverse relative to each other), which 
indicates potential parameterization issues within the remote-sensing model (GIOP-DC; e.g., Raman 
effects are ignored). As for the satellite-to-in situ $C_a$ match-ups, the AC-S results visually fall within
the full dynamic range of operational OBPG results (Fig. 3A). However, the sample sizes for the AC-S
match-ups exceeded those for the operational OBPG match-ups by almost three-fold (Tables 1 and 2),
and many of these AC-S match-ups fell well below the lowest $b_{bp}(\lambda)$ values available in SeaBASS.
Meso- and eutrophic samples dominate the population of data archived in SeaBASS (Werdell and 
Bailey, 2005), whereas oligotrophic conditions dominate the world’s oceans. 

Many empirical relationships between POC, $b_{bp}(\lambda)$, and $c_p(\lambda)$ have been proposed (e.g., Stramski
et al., 2008; Loisel et al., 2001; Behrenfeld and Boss, 2006; Cetinic et al., 2012). As in situ POC measurements were not
collected as part of Tara Oceans, we explored these relationships to evaluate the use of optical data
as proxy “ground truth” measurements for comparison with MODISA-derived POC. The relationship
between MODISA POC and AC-S $b_{bp}(700)$ generally mimicked two common relationships developed
for near surface waters (Fig. 3B). While other relationships exist (as reported in Cetinic et al. (2012)),
we arbitrarily chose relationships for the surface layer, as ocean color satellite instruments do not
“see” far below the first optical depth (i.e., the layer over which light attenuates to ~37% of its
magnitude at the surface) and a detailed evaluation of POC-IOP relationships exceeded the scope of
this paper. Over their full dynamic ranges, $b_{bp}(700)$ and POC behaved similarly to both Loisel et al.
(2001) ($r^2$ = 0.76; root mean square error (RMSE) = 0.14 for log-transformed data) and Stramski
et al. (2008) ($r^2$ = 0.76; RMSE = 0.14). Likewise, the relationship between MODISA POC and AC-S
$c_p(665)$ followed two common relationships developed for near surface waters (Fig. 3C). Over their
full dynamic ranges, $c_p(665)$ and POC generally followed both Behrenfeld and Boss (2006) ($r^2$ = 0.80;
RMSE = 0.14) and Stramski et al. (2008) ($r^2$ = 0.80; RMSE = 0.15).
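The ~37% figure quoted for the first optical depth is the value of $e^{-1}$. As a short worked step, assuming light decays exponentially with depth at a vertically uniform diffuse attenuation coefficient (written here as $K_d$, a symbol introduced only for this illustration):

```latex
E(z) = E(0)\, e^{-K_d z}, \qquad
z_1 = \frac{1}{K_d} \;\Rightarrow\; \frac{E(z_1)}{E(0)} = e^{-1} \approx 0.37 .
```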
Fig. 4. Dominant PSCs as determined by Uitz et al. (2006) (top) and Hirata et al. (2011) (bottom). Red, blue, and green circles
indicate micro-, nano-, and picoplankton, respectively. Black circles indicate stations where a dominant PSC could not be 
determined. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of 
this article.) 


Both $\gamma$ and $\eta$ are expected to provide an index of particle size (e.g., Boss et al., 2001; Behrenfeld and
Boss, 2006; Loisel et al., 2001, 2006), with lower values indicating larger mean particle sizes. The
$C_a$-based PSC algorithms proposed by Uitz et al. (2006) and Hirata et al. (2011) provided straightforward
(easily implemented) methods for estimating phytoplankton sizes from MODISA radiometric data.
Both report the relative concentrations of micro- (>20 µm), nano- (2-20 µm), and picoplankton
(<2 µm). We compared average $\gamma$ and $\eta$ for the MODISA-derived dominant size classes. We considered
a station to be dominated by a PSC when the relative presence of that class exceeded 45% (Fig. 4). Under
this definition, 31 and 16 match-up stations remained unclassified by Uitz et al. (2006) and Hirata
et al. (2011), respectively. For the satellite-to-in situ match-ups identified as dominated by micro-,
pico-, and nanoplankton by Uitz et al. (2006), the AC-S reported average $\gamma$ of 0.42, 0.83, and 0.88,
respectively (Fig. 5). For the three PSCs assigned by Hirata et al. (2011), the AC-S reported average $\gamma$ of
0.48, 0.87, and 0.83. Likewise, for the satellite-to-in situ match-ups identified as dominated by micro-,
pico-, and nanoplankton by Uitz et al. (2006), the AC-S reported average $\eta$ of 0.10, 0.62, and 0.68,
respectively (Fig. 5). For the three PSCs assigned by Hirata et al. (2011), the AC-S reported average $\eta$ of
0.13, 0.67, and 0.64. The MODISA-derived PSCs, $\gamma$, and $\eta$ all converged to discriminate between stations
with the largest microphytoplankton and the smaller nano- and picophytoplankton. For example, Fig. 5
shows decreasing $\eta$ and $\gamma$ with increasing contributions of microplankton and increasing $\eta$ and
$\gamma$ with increasing contributions of nano- and picoplankton. Effectively discriminating between the
two smallest PSCs, however, remained ambiguous for both the remote-sensing and in situ optical
methods.
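The 45% dominance rule and the per-class averaging of the spectral slopes can be sketched as follows. The function names, the dictionary layout, and the station values are hypothetical; the size-class fractions themselves would come from the Uitz et al. (2006) or Hirata et al. (2011) algorithms, which are not reproduced here.

```python
def dominant_psc(fractions, threshold=45.0):
    """Return the size class whose relative presence (%) exceeds the threshold, else None."""
    name, value = max(fractions.items(), key=lambda item: item[1])
    return name if value > threshold else None


def mean_slope_by_class(stations):
    """stations: iterable of (fractions, slope). Returns {class: mean slope}; None = unclassified."""
    groups = {}
    for fractions, slope in stations:
        groups.setdefault(dominant_psc(fractions), []).append(slope)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Hypothetical stations: size-class percentages paired with an AC-S spectral slope.
stations = [
    ({"micro": 60.0, "nano": 25.0, "pico": 15.0}, 0.40),
    ({"micro": 50.0, "nano": 30.0, "pico": 20.0}, 0.44),
    ({"micro": 10.0, "nano": 30.0, "pico": 60.0}, 0.85),
    ({"micro": 40.0, "nano": 35.0, "pico": 25.0}, 0.70),
]
means = mean_slope_by_class(stations)
```

The last station has no class above 45% and is therefore grouped under the unclassified key, mirroring the unclassified match-up stations noted above.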

Finally, to assess the role of sub-pixel variability in the MODISA-to-in situ Tara Oceans match-ups
described above, we calculated the COVs for $b_p(650)$ and $C_a$ for 1 km$^2$ bins along the R/V Tara cruise
track (Fig. 6). The median COV for $b_p(650)$ was 0.012 (equivalent to 1.2%), with a 99th percentile of 0.1
(10%). The median COV for $C_a$ was 0.06 (6%), with a 99th percentile of 0.4 (40%). Recall that MODISA
maintains a 1.1 km$^2$ pixel footprint at nadir. In practice, this exercise demonstrated minor
within-satellite-pixel variability along the Tara Oceans cruise track and suggested that sub-pixel variability
cannot fully explain the mismatch between satellite-derived and in situ measurements for 99% of the
match-up stations considered in this analysis.
Fig. 5. Box and whisker plots for $\gamma$ (top row) and $\eta$ (bottom row) versus PSC frequency as determined by Uitz et al. (2006).
The black boxes indicate the range from the first to third quartiles for frequency bins from 10-30, 30-50, 50-70, and 70-90%.
The red lines indicate the median. The blue lines show the range from the minimum to maximum value. The solid green circles 
indicate outliers (defined as more than 1.5 times the lower or upper quartile). (For interpretation of the references to colour in 
this figure legend, the reader is referred to the web version of this article.) 






Fig. 6. Coefficients of variation (COV; unitless) of AC-S measurements merged into 1 km$^2$ spatial bins for $b_p(650)$ (A) and $C_a$
(B). Sample size is 36,000. The median COVs for $b_p(650)$ and $C_a$ are 0.012 and 0.06, respectively. The 99th percentiles for $b_p(650)$
and $C_a$ are 0.1 and 0.4, respectively.


4. Discussion 

Developing and validating biogeochemical data records from operational ocean color satellite 
instruments requires substantial volumes of high quality in situ data (Werdell and Bailey, 2005; 
Bailey and Werdell, 2006; Fargion and McClain, 2003). Given their potentially large temporal and 
spatial scales, underway expeditions such as Tara Oceans offer significant potential to increase data 
volumes, provided that data collected on these expeditions are relevant to and of sufficient quality for 
ocean color satellite calibration and validation activities. We showed how the inline measurements of 
hyperspectral absorption and attenuation coefficients collected onboard the R/V Tara can be used to 
evaluate MODISA estimates of $C_a$, $b_{bp}(\lambda)$, POC, and PSCs. With regard to such validation activities, the
most obvious strength of such flow-through measurements remains their sampling rate. The 375 days
of AC-S measurements resulted in >165 viable MODISA-to-in situ match-ups (Table 1). Furthermore,
the continuous sampling provided sufficient data to confirm that sub-pixel variability does not drive
mismatches between the satellite and in situ measurements (Fig. 6). A clear weakness, however,
remains the need to apply bio-optical models to estimate biogeochemical quantities of interest (e.g., $C_a$
from $a_p(\lambda)$ and $c_p(\lambda)$). Our results indicate this can be done somewhat reliably and effectively;
however, they also point to model assumptions and parameterizations that can be improved upon.

Chlorophyll-a can be reliably estimated using $a_p(\lambda)$ via the line-height method (Eq. (2); Boss et al.,
2007), given reasonable estimates of how $C_a$ varies with $a_{ph}(\lambda)$ (Eq. (3); Bricaud et al., 1998). We
derived a cruise-specific relationship between $C_a$ and $a_{ph}(\lambda)$, and realized MODISA-to-in situ
match-ups of comparable quality to those from HPLC-derived $C_a$. The number of AC-S match-ups exceeded
that for the HPLC match-ups by more than ten-fold (165 versus 13; Table 1), which reinforces the benefit of inline
time-series when in need of high volumes of data (say, during the first year of a new satellite mission).
However, improved results might be achieved with refined line-height methods (Roesler and Barnard,
in this volume) and progressively more robust estimates of $C_a$ from $a_{ph}(\lambda)$. Given the near consistency
of $\tilde{b}_{bp}$ in the open ocean (Twardowski et al., 2001; Whitmire et al., 2007), $b_{bp}(\lambda)$ can also be reliably
estimated from $b_p(\lambda)$. Not only did the MODISA-to-AC-S match-ups of $b_{bp}(\lambda)$ modestly agree with
those reported by the OBPG for all available SeaBASS data (Tables 1 and 2; Fig. 3A), but the sample
size of the former exceeded the latter almost three-fold. While $b_{bp}(\lambda)$ modeled from $a_p(\lambda)$ and $c_p(\lambda)$ remains
several steps removed from the direct measurement of $b_{bp}(\lambda)$ (Twardowski et al., 2001), our results
indicate they are of sufficient quality for use in ocean color validation activities, at least early in a new
mission for preliminary assessment of remotely-sensed variables.

Many published relationships between POC, $b_{bp}(\lambda)$, and $c_p(\lambda)$ exist; however, most were developed
using spatially-limited datasets and with varied in situ measurement protocols (Cetinic et al., 2012).
Moreover, a paucity of POC measurements exists in SeaBASS (although we acknowledge that larger datasets
exist elsewhere), indicating a need to pursue optical proxies for use in satellite validation activities.
Our AC-S-derived IOPs provided a reasonable first-order verification of satellite-derived POC (Fig. 3).
The relationships between POC and $b_{bp}(\lambda)$ and between POC and $c_p(\lambda)$ both largely followed previously
published relationships for the near-surface ocean. Improved agreement between satellite POC and in
situ optics might have been realized, however, if discrete in situ measurements of POC had been made as part of
Tara Oceans. As we did for the AC-S-derived $C_a$ using HPLC analyses, such discrete measurements could
be used to tune POC-to-IOP relationships for the expedition cruise track. As for POC, large volumes
of direct measurements of PSCs remain uncommon in most public databases. To overcome this, the
ocean color community commonly uses HPLC-derived pigments as proxy indicators of phytoplankton
community structure (see, e.g., the discussion of diagnostic pigment analyses presented in Uitz et al.
(2006) and Hirata et al. (2011) and references therein). The optical parameters $\eta$ and $\gamma$, however, provide
alternate proxy indicators of particle sizes (Boss et al., 2001; Loisel et al., 2006). On average, the
underway AC-S estimates of $\eta$ and $\gamma$ appear to effectively discriminate between the largest and
smallest particles (microphytoplankton versus nano- and picoplankton) (Fig. 5). The use of these in situ
proxies for PSCs remains an ongoing, unresolved issue (both using HPLC and optics), as does achieving
convergence in remote-sensing methods to estimate PSCs (e.g., Brewin et al., 2011; Fig. 4). Our results
contribute to an emerging body of research on the validation of remotely sensed PSCs; however,
significant work remains to truly understand relationships between varied indirect estimates of
particle sizes.

We did not engage in this activity to ultimately propose that underway measurements of 
IOPs replace discrete biogeochemical measurements or vertical profiles in ocean color algorithm 
development and satellite data product validation activities. Rather, we hoped to demonstrate that 
underway measurements provide a previously unexploited means for achieving substantial volumes 
of complementary, high quality data for use in these activities. Tuning the AC-S estimates of $C_a$ to
sparsely and discretely sampled HPLC-derived $C_a$ during the Tara Oceans expedition extended its
sample size of satellite-to-in situ match-ups from 13 to 165. Furthermore, in the absence of a broad, 
institutionally supported program such as SIMBIOS, opportunistically outfitting sea-going vessels (of 
any purpose) with an inline system could provide high volumes of data with low costs relative to other 
data acquisition strategies. For example, for the 18-month overlap between the SIMBIOS Program 
and the MODISA mission (mid 2002 through 2003), the sample size of MODISA-to-in situ $C_a$
match-ups averaged 155 per year. In the four years that followed the conclusion of SIMBIOS (2004-2007),
this rate fell to 104 $C_a$ match-ups per year. In three of those years (2005-2007), the average rate fell
further to 54 per year. In contrast, in the two and a half year overlap between Tara Oceans and MODISA,
the sample size of MODISA-to-AC-S-derived $C_a$ match-ups averaged 66 per year. In our current era of
imposing rapid validation requirements on new satellite missions - and our community-wide need to
develop remote sensing algorithms and validate satellite data records on unprecedented temporal and
spatial scales - our results have significant implications for how forthcoming calibration and validation
teams for existing and upcoming satellite missions will be constructed.